Commonly Used Custom M Functions for Power Query v1.0

A record of the custom functions I use most often. Last updated 2021/11/29 22:22.
(Temporary post, parked here for now.)

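Text functions

Text_Contains | Text: contains

Syntax

// A thin wrapper around Text.Contains, kept so that all function names follow the same structure and are easier to manage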
(text as nullable text, substring as text, optional comparer as nullable function) as nullable logical =>
    Text.Contains(text, substring, comparer)

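About

Checks whether text contains the value substring and returns true if it does. Wildcards and regular expressions are not supported.

The optional comparer argument can be used to request a case-insensitive or a culture- and locale-aware comparison. The formula language provides the following built-in comparers:

Comparer.Ordinal: performs a case-sensitive ordinal comparison
Comparer.OrdinalIgnoreCase: performs a case-insensitive ordinal comparison
Comparer.FromCulture: performs a culture-aware comparison

Parameter	Type	Description
text	`required, nullable` | text | example: "abc"	The text to search in
substring	`required` | text | example: "abc"	The text to search for
comparer	`optional` | function | example: Comparer.OrdinalIgnoreCase	Comparer function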

Example 1

Check whether the text "Hello World" contains "Hello".

Text_Contains("Hello World", "Hello")

true

Example 2

Check whether the text "Hello World" contains "hello".

Text_Contains("Hello World", "hello")

false

Example 3

Use a case-insensitive comparer to check whether the text "Hello World" contains "hello".

Text_Contains("Hello World", "hello", Comparer.OrdinalIgnoreCase)

true
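The third built-in comparer, Comparer.FromCulture, can be passed in the same way. The following is a minimal sketch rather than part of the original examples: the culture name "en-US" is an assumption, and the wrapper is repeated locally only so that the query runs on its own.

```powerquery
let
    // Local copy of the wrapper so the snippet is self-contained
    Text_Contains = (text as nullable text, substring as text, optional comparer as nullable function) as nullable logical =>
        Text.Contains(text, substring, comparer),
    // Culture-aware comparer; the second argument (ignoreCase = true) makes it case-insensitive.
    // "en-US" is an assumed culture name, swap in the one that matches your data.
    EnUsIgnoreCase = Comparer.FromCulture("en-US", true),
    Result = Text_Contains("Hello World", "hello", EnUsIgnoreCase)
in
    Result  // true
```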

Text_ContainsAny | Text: contains any

Syntax

(text as nullable text, substrings as list, optional comparer as nullable function) as nullable logical =>
    List.AnyTrue(List.Transform(substrings, each Text.Contains(text, _, comparer)))

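About

Checks whether text contains any of the values in the substrings list and returns true if at least one is found. Wildcards and regular expressions are not supported.

The optional comparer argument can be used to request a case-insensitive or a culture- and locale-aware comparison. The formula language provides the following built-in comparers:

Comparer.Ordinal: performs a case-sensitive ordinal comparison
Comparer.OrdinalIgnoreCase: performs a case-insensitive ordinal comparison
Comparer.FromCulture: performs a culture-aware comparison

Parameter	Type	Description
text	`required, nullable` | text | example: "abc"	The text to search in
substrings	`required` | list | example: {"a","b","c"}	List of texts to search for
comparer	`optional` | function | example: Comparer.OrdinalIgnoreCase	Comparer function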

Example 1

Check whether the text "Hello World" contains "Hello" or "hello".

Text_ContainsAny("Hello World", {"Hello","hello"})

true

Example 2

Check whether the text "Hello World" contains "hello" or "world".

Text_ContainsAny("Hello World", {"hello","world"})

false

Example 3

Use a case-insensitive comparer to check whether the text "Hello World" contains "hello" or "world".

Text_ContainsAny("Hello World", {"hello","world"}, Comparer.OrdinalIgnoreCase)

true
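A typical use is filtering table rows against a keyword list. The sketch below is only an illustration: the sample rows and the column name Title are made up, and the function is repeated locally so the query is self-contained.

```powerquery
let
    // Local copy of the function so the snippet runs on its own
    Text_ContainsAny = (text as nullable text, substrings as list, optional comparer as nullable function) as nullable logical =>
        List.AnyTrue(List.Transform(substrings, each Text.Contains(text, _, comparer))),
    // Hypothetical sample data
    Source = Table.FromRecords({
        [Title = "Power Query basics"],
        [Title = "DAX patterns"],
        [Title = "M language reference"]
    }),
    // Keep rows whose Title contains any keyword, ignoring case
    Filtered = Table.SelectRows(
        Source,
        each Text_ContainsAny([Title], {"query", "m language"}, Comparer.OrdinalIgnoreCase)
    )
in
    Filtered
```

With this sample data the first and third rows remain, because "DAX patterns" contains neither keyword.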

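Text_In | Text: equals any

Syntax

// Built on List.ContainsAny, so it can test values of any type, not only text
(value as nullable any, substring as list) as nullable logical =>
    List.ContainsAny({value}, substring)
// Passing Text.Contains as the third argument would restrict the check to text values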

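About

Checks whether value equals any of the values in the substring list and returns true if a match is found. Because whole values are compared, the check is not limited to text. Wildcards and regular expressions are not supported.

Parameter	Type	Description
value	`required, nullable` | any | example: "abc"	The text or value to look for
substring	`required` | list | example: {"a","b","c"}	List of candidate values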

Example 1

Check whether the text "Hello World" equals "Hello World" or "hello".

Text_In("Hello World", {"Hello World","hello"})

true

Example 2

Check whether the text "Hello World" equals "hello world" or "World".

Text_In("Hello World", {"hello world","World"})

false
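Since the comparison is delegated to List.ContainsAny, the same call should also work for numbers and dates. A small sketch, with the function repeated locally and all sample values made up:

```powerquery
let
    // Local copy of the function so the snippet runs on its own
    Text_In = (value as nullable any, substring as list) as nullable logical =>
        List.ContainsAny({value}, substring),
    Result = [
        // A number found in a list of numbers
        NumberMatch = Text_In(3, {1, 2, 3}),                // true
        // The text "3" is not equal to the number 3
        TypeMismatch = Text_In("3", {1, 2, 3}),             // false
        // Dates are compared as whole values as well
        DateMatch = Text_In(#date(2021, 11, 29), {#date(2021, 11, 28), #date(2021, 11, 29)})  // true
    ]
in
    Result
```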

Text_Frequency | Text: occurrence count

Syntax

// Counts how many times a substring occurs in the text.
(text as nullable text, substring as text) as number =>
    let
        splitText = Text.Split(text, substring),
        count = List.Count(splitText) - 1
    in
        count

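About

Counts how many times substring occurs in text. Returns a number.

Parameter	Type	Description
text	`required, nullable` | text | example: "abc"	The text to search in
substring	`required` | text | example: "a"	The text whose occurrences are counted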

Example 1

Count how many times "l" occurs in the text "Hello World".

Text_Frequency("Hello World","l")

3

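Text_FrequencyAll | Text: occurrence counts for all

Syntax

// Counts how often each substring in the list occurs in the text and returns a list of records.
// Depends on the custom function Text_Frequency defined above.
(text as nullable text, substrings as list) as list =>
    let
        counts = List.Transform(substrings, each [key=_, num=Text_Frequency(text, _)])
    in
        counts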

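About

Counts, for every text in the substrings list, how many times it occurs in text. Returns a list of records with the fields key and num.

Parameter	Type	Description
text	`required, nullable` | text | example: "abc"	The text to search in
substrings	`required` | list | example: {"a","b","c"}	List of texts whose occurrences are counted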

Example 1

Count how many times "l" and "o" occur in the text "Hello World".

Text_FrequencyAll("Hello World",{"l","o"})

{[key = "l", num = 3], [key = "o", num = 2]}
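Because every record in the result has the same two fields, the list converts directly into a table. The sketch below assumes the helper is available under the name Text_Frequency and repeats both functions locally so the query runs on its own.

```powerquery
let
    // Local copies of the two functions so the snippet is self-contained
    Text_Frequency = (text as nullable text, substring as text) as number =>
        List.Count(Text.Split(text, substring)) - 1,
    Text_FrequencyAll = (text as nullable text, substrings as list) as list =>
        List.Transform(substrings, each [key = _, num = Text_Frequency(text, _)]),
    Counts = Text_FrequencyAll("Hello World", {"l", "o"}),
    // A list of records with identical fields becomes a two-column table (key, num)
    AsTable = Table.FromRecords(Counts)
in
    AsTable
```

The resulting table has one row per substring: "l" with 3 and "o" with 2.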

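Text_FrequencyAllToText | Text: occurrence counts for all, as text

Syntax

(text as nullable text,substring as list) as text =>
[
    len = Text.Length(text),
    list = Text.Combine(List.RemoveNulls(List.Transform(substring,(key)=>
            [
                // Prefixing every match with "." lengthens the text by one character per occurrence
                num = Text.Length(Text.Replace(text,key,"."&key))-len,
                res = Text.Combine({key,Number.ToText(num)},"：")
            ][res]
        )),",")
][list]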

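About

Counts, for every text in the substring list, how many times it occurs in text. Returns the counts joined into one text value, each entry written as key：count and the entries separated by commas.

Parameter	Type	Description
text	`required, nullable` | text | example: "abc"	The text to search in
substring	`required` | list | example: {"a","b","c"}	List of texts whose occurrences are counted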

Example 1

Count how many times "l" and "o" occur in the text "Hello World".

Text_FrequencyAllToText("Hello World",{"l","o"})

l：3,o：2
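The counting trick in this function is easy to miss: instead of splitting the text, it replaces every occurrence of the key with "." followed by the key and measures how much longer the text became. A worked sketch of that single step, with variable names chosen here for illustration:

```powerquery
let
    text = "Hello World",
    key = "l",
    // Each occurrence of key gains exactly one extra character
    marked = Text.Replace(text, key, "." & key),      // "He.l.lo Wor.ld"
    // 14 characters after marking, 11 before, so 3 occurrences
    count = Text.Length(marked) - Text.Length(text)
in
    count  // 3
```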

Data retrieval functions

Data_UnZip | Data: unzip a Zip file

Syntax

// CnBinary: the Zip file as a binary value
(CnBinary) =>[

        Header = BinaryFormat.Record([
            MiscHeader = BinaryFormat.Binary(14),
            BinarySize = BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger32, ByteOrder.LittleEndian),
            FileSize   = BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger32, ByteOrder.LittleEndian),
            FileNameLen= BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger16, ByteOrder.LittleEndian),
            ExtrasLen  = BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger16, ByteOrder.LittleEndian)    
        ]),

        HeaderChoice = BinaryFormat.Choice(
            BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger32, ByteOrder.LittleEndian),
            each if _ <> 67324752 // 67324752 = 0x04034B50, the Zip local file header signature
                then BinaryFormat.Record([IsValid = false, Filename=null, Content=null])
                else BinaryFormat.Choice(
                        BinaryFormat.Binary(26),
                        each BinaryFormat.Record([
                            IsValid  = true,
                            Filename = BinaryFormat.Text(Header(_)[FileNameLen],BinaryEncoding.Base64),
                            Extras   = BinaryFormat.Text(Header(_)[ExtrasLen]),
                            Content  = BinaryFormat.Transform(
                                BinaryFormat.Binary(Header(_)[BinarySize]),
                                (x) => try Binary.Buffer(Binary.Decompress(x, Compression.Deflate)) otherwise null
                            )
                            ]),
                            type binary
                    )
        ),

        ZipFormat = BinaryFormat.List(HeaderChoice, each _[IsValid] = true),

        Entries = List.Transform(
            List.RemoveLastN( ZipFormat(CnBinary), 1),
            (e) => [FileName = e[Filename], Content = e[Content] ]
        ),

        Results = Table.FromRecords(Entries)

    ][Results]

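About

Reads the contents of the files inside a Zip archive. Returns a table with one row per file and the columns FileName and Content, where Content is the decompressed binary (or null if decompression fails).

Parameter	Type	Description
CnBinary	`required` | binary	The Zip file to extract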
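A usage sketch, under two assumptions that are not part of the original: the function above is saved as a query named Data_UnZip, and a Zip file exists at the hypothetical path shown. It extracts the archive, keeps the CSV entries and parses each one.

```powerquery
let
    // Hypothetical path; replace it with a real Zip file on your machine
    ZipBinary = File.Contents("C:\Temp\archive.zip"),
    // Assumes the function above is saved as a query named Data_UnZip
    Files = Data_UnZip(ZipBinary),
    // Keep only entries whose name ends in .csv, ignoring case
    CsvOnly = Table.SelectRows(Files, each Text.EndsWith([FileName], ".csv", Comparer.OrdinalIgnoreCase)),
    // Parse each decompressed binary into a table
    Parsed = Table.AddColumn(CsvOnly, "Data", each Csv.Document([Content]))
in
    Parsed
```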

Data_UnZipPath | Data: unzip a Zip file in a folder

Syntax

// path: path to the Zip file
(path) =>[
        Header = BinaryFormat.Record([
            MiscHeader = BinaryFormat.Binary(14),
            BinarySize = BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger32, ByteOrder.LittleEndian),
            FileSize   = BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger32, ByteOrder.LittleEndian),
            FileNameLen= BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger16, ByteOrder.LittleEndian),
            ExtrasLen  = BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger16, ByteOrder.LittleEndian)    
        ]),

        HeaderChoice = BinaryFormat.Choice(
            BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger32, ByteOrder.LittleEndian),
            each if _ <> 67324752 // 67324752 = 0x04034B50, the Zip local file header signature
                then BinaryFormat.Record([IsValid = false, Filename=null, Content=null])
                else BinaryFormat.Choice(
                        BinaryFormat.Binary(26),
                        each BinaryFormat.Record([
                            IsValid  = true,
                            Filename = BinaryFormat.Text(Header(_)[FileNameLen],BinaryEncoding.Base64),
                            Extras   = BinaryFormat.Text(Header(_)[ExtrasLen]),
                            Content  = BinaryFormat.Transform(
                                BinaryFormat.Binary(Header(_)[BinarySize]),
                                (x) => try Binary.Buffer(Binary.Decompress(x, Compression.Deflate)) otherwise null
                            )
                            ]),
                            type binary
                    )
        ),

        ZipFormat = BinaryFormat.List(HeaderChoice, each _[IsValid] = true),

        Entries = List.Transform(
            List.RemoveLastN( ZipFormat(File.Contents(path)), 1),
            (e) => [FileName = e[Filename], Content = e[Content] ]
        ),

        Results = Table.FromRecords(Entries)

    ][Results]

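About

Reads the Zip archive at the given path and returns the contents of the files inside it, as a table with the columns FileName and Content.

Parameter	Type	Description
path	`required` | text	Path to the Zip file (passed to File.Contents)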

Conversion functions

Weibo_Int10To62 | Weibo: base 10 to base 62

Syntax

(x as number)=>[ 
    t = {"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z"},
    r = "",
    Int10To62= (n,r,e,p) => 
    if e = 0 then 
        (
            if p = 0 then
                r 
            else
                @Int10To62(Number.Mod(e,62),t{n}&r,Number.RoundDown(e/62),p-1)
        )
    else @Int10To62(Number.Mod(e,62),t{n}&r,Number.RoundDown(e/62),p), 
    Result = Int10To62(Number.Mod(x,62),"",Number.RoundDown(x/62),1)
][Result]

About

Base conversion helper called when converting between Weibo Mid, Uid and Url values. Converts a base-10 number to base-62 text.

Weibo_Str62To10 | Weibo: base 62 to base 10

Syntax

(str as text)=>[
    t = {"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z"},
    e = Text.ToList(str),
    s = List.Count(e)-1,
    x1 = List.Sum(List.Transform({0..s},each List.PositionOf(t,e{_})*Number.Power(62,Number.Abs(_-s))))
][x1]

About

Base conversion helper called when converting between Weibo Mid, Uid and Url values. Converts base-62 text to a base-10 number.

Weibo_UrlToMid | Weibo: Url to Mid

Syntax

// This function calls the base conversion function Weibo_Str62To10
(str as text)=>[
    a = List.Transform(List.Split(List.Reverse(Text.ToList(str)),4),each Number.ToText(Weibo_Str62To10(Text.Combine(List.Reverse(_))))),
    r = Value.FromText(Text.Combine(List.Reverse(List.Transform(a,each Text.Repeat("0",7-Text.Length(_))&_))))
][r]

About

Converts a Weibo Url code to a Mid. The result is returned as a number.

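Weibo_MidToUrl | Weibo: Mid to Url

Syntax

// This function calls the base conversion function Weibo_Int10To62
(mid as text)=>[
    // Split the Mid into 7-digit groups from the right and convert each group to base 62
    a = List.Reverse(List.Transform(List.Split(List.Reverse(Text.ToList(mid)),7),each Weibo_Int10To62(Value.FromText(Text.Combine(List.Reverse(_)))))),
    // Every group after the first stands for exactly 4 base-62 characters. Without left-padding,
    // leading zeros are dropped and the link comes out wrong (the earlier version only patched the 8-character "K..." case).
    x = Text.Combine(List.Transform(List.Positions(a), each if _ = 0 then a{_} else Text.PadStart(a{_}, 4, "0")))
][x]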

About

Converts a Weibo Mid (passed as text) to the Url code.
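To make the chunking scheme behind the two Url/Mid functions concrete, here is a self-contained sketch. It is not the author's code: the helper names (FromBase62, ToBase62, ChunksFromRight, UrlToMid, MidToUrl) are hypothetical, the Mid is kept as text throughout, and the sample pair z0JH2lOMb / 3501756485200075 is a widely circulated example rather than one taken from this article. Url codes are cut into 4-character groups from the right, Mids into 7-digit groups, and every group except the first is left-padded with "0".

```powerquery
let
    // Weibo digit order: 0-9, then a-z, then A-Z
    Alphabet = Text.ToList("0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"),
    // Base-62 text -> number, folding the digits left to right
    FromBase62 = (s as text) as number =>
        List.Accumulate(Text.ToList(s), 0, (acc, ch) => acc * 62 + List.PositionOf(Alphabet, ch)),
    // Non-negative integer -> base-62 text (recursive)
    ToBase62 = (n as number) as text =>
        if n < 62 then Alphabet{n}
        else @ToBase62(Number.IntegerDivide(n, 62)) & Alphabet{Number.Mod(n, 62)},
    // Split text into groups of `size` characters, counted from the right
    ChunksFromRight = (s as text, size as number) as list =>
        List.Reverse(List.Transform(
            List.Split(List.Reverse(Text.ToList(s)), size),
            each Text.Combine(List.Reverse(_)))),
    // Left-pad every group except the first to a fixed width
    PadTail = (parts as list, width as number) as text =>
        Text.Combine(List.Transform(List.Positions(parts),
            each if _ = 0 then parts{_} else Text.PadStart(parts{_}, width, "0"))),
    UrlToMid = (code as text) as text =>
        PadTail(List.Transform(ChunksFromRight(code, 4), each Number.ToText(FromBase62(_))), 7),
    MidToUrl = (mid as text) as text =>
        PadTail(List.Transform(ChunksFromRight(mid, 7), each ToBase62(Number.FromText(_))), 4),
    Result = [
        Mid = UrlToMid("z0JH2lOMb"),         // "3501756485200075"
        Url = MidToUrl("3501756485200075")   // "z0JH2lOMb"
    ]
in
    Result
```

In this example the middle group 0175648 converts to "JH2", which is only three characters long; the padding step is what restores the "0" in "z0JH2lOMb".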

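SPID_Int10To62 | SPID: base 10 to base 62

Syntax

// Same algorithm as Weibo_Int10To62, but with the digit order 0-9, A-Z, a-z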
(x as number)=>[ 
    t = Text.ToList("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"),
    r = "",
    Int10To62= (n,r,e,p) => 
    if e = 0 then 
        (
            if p = 0 then
                r 
            else
                @Int10To62(Number.Mod(e,62),t{n}&r,Number.RoundDown(e/62),p-1)
        )
    else @Int10To62(Number.Mod(e,62),t{n}&r,Number.RoundDown(e/62),p), 
    Result = Int10To62(Number.Mod(x,62),"",Number.RoundDown(x/62),1)
][Result]

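SPID_Str62To10 | SPID: base 62 to base 10

Syntax

// Same algorithm as Weibo_Str62To10, but with the digit order 0-9, A-Z, a-z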
(str as text)=>[
    t = Text.ToList("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"),
    e = Text.ToList(str),
    s = List.Count(e)-1,
    x1 = List.Sum(List.Transform({0..s},each List.PositionOf(t,e{_})*Number.Power(62,Number.Abs(_-s))))
][x1]
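The Weibo and SPID variants differ only in the order of the 62 digits, so the same number encodes to different text. A short sketch with a hypothetical helper, Encode, that takes the digit list as a parameter:

```powerquery
let
    WeiboDigits = Text.ToList("0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"),
    SpidDigits  = Text.ToList("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"),
    // Hypothetical helper: encode a non-negative integer with the given 62-symbol digit list
    Encode = (n as number, digits as list) as text =>
        if n < 62 then digits{n}
        else @Encode(Number.IntegerDivide(n, 62), digits) & digits{Number.Mod(n, 62)},
    Result = [
        Weibo = Encode(5200075, WeiboDigits),  // "lOMb"
        SPID  = Encode(5200075, SpidDigits)    // "LomB"
    ]
in
    Result
```

The digit values are identical in both cases (21, 50, 48, 11); only the symbols assigned to them change, which is why the two function families must not be mixed.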

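Bilibili_AvToBv | Bilibili: Av number to Bv number

Syntax

/*
This function follows the algorithm from a Zhihu answer:
Author: mcfx
Link: https://www.zhihu.com/question/381784377/answer/1099438784
Source: Zhihu. Copyright belongs to the author. For commercial reuse, contact the author for permission; for non-commercial reuse, credit the source.
*/
// Reads its constants from RecordArray[Bilibili]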
(x as number)=>[
    Bilibili = RecordArray[Bilibili],
    x = Number.BitwiseXor(x,Bilibili[xor]) + Bilibili[add],
    x1 = List.Transform(Bilibili[i], each 
        Bilibili[tr]{Number.Mod(Number.RoundDown(x/Number.Power(58,_)),58)}
    ),
    x2 = Record.ToTable(Bilibili[r]&Record.FromList(x1,Bilibili[st])),
    更改的类型 = Table.TransformColumnTypes(x2,{{"Name", Int64.Type}}),
    排序的行 = Table.Sort(更改的类型,{{"Name", Order.Ascending}}),
    x3 = Text.Combine(排序的行[Value],"")
][x3]

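Bilibili_BvToAv | Bilibili: Bv number to Av number

Syntax

/*
This function follows the algorithm from a Zhihu answer:
Author: mcfx
Link: https://www.zhihu.com/question/381784377/answer/1099438784
Source: Zhihu. Copyright belongs to the author. For commercial reuse, contact the author for permission; for non-commercial reuse, credit the source.
*/
// Reads its constants from RecordArray[Bilibili]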
(x as text)=>[
    Bilibili = RecordArray[Bilibili],
    str = Text.ToList(x),
    x1 = List.Transform(Bilibili[i], each 
        List.PositionOf(Bilibili[tr],str{Bilibili[s]{_}})*Number.Power(58,_)
    ),
    x2 = Number.BitwiseXor((List.Sum(x1)-Bilibili[add]),Bilibili[xor])
][x2]

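Record values [used by function calls]

RecordArray

Record value array

let
    RecordArray = [
        // Used when converting Bilibili AV numbers to BV numbers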
        Bilibili = [
            tr = Text.ToList("fZodR9XQDSUm21yCkr6zBqiveYah8bt4xsWpHnJE7jL5VG3guMTKNPAwcF"),
            s = {11, 10, 3, 8, 4, 6},
            st = {"11", "10", "3", "8", "4", "6"},
            xor = 177451812,
            add = 8728348608,
            i = {0..5},
            r = [0="B",1="V",2="1",5="4",7="1",9="7"]
        ]
    ]
in
    RecordArray
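The two Bilibili functions read their constants from this record. As a cross-check of the scheme those constants encode, here is a self-contained sketch that inlines them. It is an illustration, not the author's implementation: the names Table58, Slots, XorKey, AddKey, AvToBv and BvToAv are hypothetical, the BV template is built with List.ReplaceRange instead of a record merge, and the sample pair av170001 / BV17x411w7KC is a commonly cited example rather than one from this article.

```powerquery
let
    // Constants copied from RecordArray[Bilibili]
    Table58 = Text.ToList("fZodR9XQDSUm21yCkr6zBqiveYah8bt4xsWpHnJE7jL5VG3guMTKNPAwcF"),
    Slots = {11, 10, 3, 8, 4, 6},
    XorKey = 177451812,
    AddKey = 8728348608,
    AvToBv = (av as number) as text =>
        let
            n = Number.BitwiseXor(av, XorKey) + AddKey,
            // Fixed characters of the BV id; the six blanks are filled in below
            template = Text.ToList("BV1  4 1 7  "),
            // The i-th base-58 digit of n goes to position Slots{i}
            filled = List.Accumulate({0..5}, template, (acc, i) =>
                List.ReplaceRange(acc, Slots{i}, 1,
                    {Table58{Number.Mod(Number.IntegerDivide(n, Number.Power(58, i)), 58)}}))
        in
            Text.Combine(filled),
    BvToAv = (bv as text) as number =>
        let
            chars = Text.ToList(bv),
            // Read the six digits back from their slots and rebuild the number
            n = List.Sum(List.Transform({0..5}, (i) =>
                List.PositionOf(Table58, chars{Slots{i}}) * Number.Power(58, i)))
        in
            Number.BitwiseXor(n - AddKey, XorKey),
    Result = [
        Bv = AvToBv(170001),           // "BV17x411w7KC"
        Av = BvToAv("BV17x411w7KC")    // 170001
    ]
in
    Result
```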

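End of article

Posted in: Data Governance and Analysis

2024-05-11

Copyright notice: original article by binbin, published 2024-05-11, 9582 characters in total.

Reposting: unless otherwise stated, articles on this site are published under the CC-4.0 license. Please credit the source when reposting.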


Contents

Text functions
Text_Contains | Text: contains
Text_ContainsAny | Text: contains any
Text_In | Text: equals any
Text_Frequency | Text: occurrence count
Text_FrequencyAll | Text: occurrence counts for all
Text_FrequencyAllToText | Text: occurrence counts for all, as text

Data retrieval functions
Data_UnZip | Data: unzip a Zip file
Data_UnZipPath | Data: unzip a Zip file in a folder

Conversion functions
Weibo_Int10To62 | Weibo: base 10 to base 62
Weibo_Str62To10 | Weibo: base 62 to base 10
Weibo_UrlToMid | Weibo: Url to Mid
Weibo_MidToUrl | Weibo: Mid to Url
SPID_Int10To62 | SPID: base 10 to base 62
SPID_Str62To10 | SPID: base 62 to base 10
Bilibili_AvToBv | Bilibili: Av number to Bv number
Bilibili_BvToAv | Bilibili: Bv number to Av number

Record values [used by function calls]
RecordArray | Record value array
